Introduction to Computer Vision: Plant Seedlings Classification¶

Problem Statement¶

Context¶

In recent times, the field of agriculture has been in urgent need of modernization, since checking whether plants are growing correctly still requires an extensive amount of manual work. Despite several advances in agricultural technology, workers in the industry still need to sort and recognize different plants and weeds by hand, which takes considerable time and effort in the long term. This trillion-dollar industry is ripe for technological innovations that cut down on the requirement for manual labor, and this is where Artificial Intelligence can genuinely benefit workers in the field: the time and energy required to identify plant seedlings can be greatly shortened by the use of AI and Deep Learning. Doing so far more efficiently, and potentially more effectively, than experienced manual labor could lead to better crop yields, free up human involvement for higher-order agricultural decision making, and in the long term result in more sustainable environmental practices in agriculture as well.

Objective¶

The aim of this project is to build a Convolutional Neural Network to classify plant seedlings into their respective categories.

Data Dictionary¶

The Aarhus University Signal Processing group, in collaboration with the University of Southern Denmark, has recently released a dataset containing images of unique plants belonging to 12 different species.

  • The dataset can be downloaded from Olympus.
  • The data file names are:
    • images.npy
    • Labels.csv
  • Due to the large volume of data, the images were converted into the images.npy file and the labels into Labels.csv, so that you can work on the data/project seamlessly without having to worry about the high data volume.

  • The goal of the project is to create a classifier capable of determining a plant's species from an image.

List of Species

  • Black-grass
  • Charlock
  • Cleavers
  • Common Chickweed
  • Common Wheat
  • Fat Hen
  • Loose Silky-bent
  • Maize
  • Scentless Mayweed
  • Shepherds Purse
  • Small-flowered Cranesbill
  • Sugar beet

Note: Please use GPU runtime on Google Colab to execute the code faster.¶

Importing necessary libraries¶

In [1]:
# Installing the libraries with the specified version.
# uncomment and run the following line if Google Colab is being used
!pip install tensorflow==2.15.0 scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==1.5.3 opencv-python==4.8.0.76 -q --user
  WARNING: The scripts f2py, f2py3 and f2py3.10 are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script tensorboard is installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The scripts estimator_ckpt_converter, import_pb_to_tensorboard, saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, toco and toco_from_protos are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pandas<2.2.2dev0,>=2.0, but you have pandas 1.5.3 which is incompatible.
google-colab 1.0.0 requires pandas==2.1.4, but you have pandas 1.5.3 which is incompatible.
mizani 0.11.4 requires pandas>=2.1.0, but you have pandas 1.5.3 which is incompatible.
pandas-stubs 2.1.4.231227 requires numpy>=1.26.0; python_version < "3.13", but you have numpy 1.25.2 which is incompatible.
plotnine 0.13.6 requires pandas<3.0.0,>=2.1.0, but you have pandas 1.5.3 which is incompatible.
tensorstore 0.1.65 requires ml-dtypes>=0.3.1, but you have ml-dtypes 0.2.0 which is incompatible.
tf-keras 2.17.0 requires tensorflow<2.18,>=2.17, but you have tensorflow 2.15.0 which is incompatible.
xarray 2024.9.0 requires pandas>=2.1, but you have pandas 1.5.3 which is incompatible.
In [ ]:
# Installing the libraries with the specified version.
# uncomment and run the following lines if Jupyter Notebook is being used
#!pip install tensorflow==2.13.0 scikit-learn==1.2.2 seaborn==0.11.1 matplotlib==3.3.4 numpy==1.24.3 pandas==1.5.2 opencv-python==4.8.0.76 -q --user
In [2]:
#mounting the Google drive to access the datasets
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [42]:
import os
import numpy as np                                                                               # Importing numpy for matrix operations
import pandas as pd                                                                              # Importing pandas to manipulate DataFrames
import matplotlib.pyplot as plt                                                                  # Importing matplotlib for plotting and visualizing images
import math                                                                                      # Importing math module to perform mathematical operations
import cv2                                                                                       # Importing openCV for image processing
import seaborn as sns                                                                            # Importing seaborn to plot graphs

#Tensorflow modules
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator                              # Importing for data augmentation
from tensorflow.keras.models import Sequential                                                   # Importing to define a sequential model
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,BatchNormalization # Defining all the layers to build our CNN models
from tensorflow.keras.optimizers import Adam,SGD                                                 # Importing the optimizers for the models
from sklearn import preprocessing                                                                # Importing to preprocess the data
from sklearn.model_selection import train_test_split                                             # Importing train_test_split function to split the data into train and test
from sklearn.metrics import confusion_matrix                                                     # Importing to plot confusion matrices
from sklearn.metrics import classification_report                                                # Importing to plot classification reports for the models

# additional Keras/sklearn imports for later models and evaluation
from keras.models import Model                                                   # Importing the functional Model class for transfer learning
from keras.applications.vgg16 import VGG16                                       # Importing the pre-trained VGG16 base for transfer learning
from keras.callbacks import ReduceLROnPlateau, EarlyStopping                     # Importing callbacks to control training
from sklearn.preprocessing import LabelBinarizer                                 # Importing to one-hot encode the target labels
from sklearn import metrics                                                      # Importing to compute classification reports and scores
from tensorflow.keras import backend                                             # Importing to clear the Keras session between models
import random                                                                    # Importing to set the Python random seed

# to display images in colab using OpenCV
from google.colab.patches import cv2_imshow
In [4]:
import warnings
warnings.filterwarnings('ignore')

#formatting numeric data for easier readability
pd.set_option(
    "display.float_format", lambda x: "%.2f" % x
)  # to display numbers rounded to 2 decimal places

Note: After running the above cell, kindly restart the notebook kernel and run all cells sequentially from the start again.

In [ ]:
 

Loading the dataset¶

In [5]:
# Run the below code if you are using Google Colab
# (safe to re-run; Colab will report that the drive is already mounted)
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
In [8]:
#loading the dataset of images
path1 = "/content/sample_data/images.npy"
images = np.load(path1)

#loading the dataset of labels
path2 = "/content/sample_data/Labels.csv"
labels = pd.read_csv(path2)
In [ ]:
 

Data Overview¶

In [9]:
print(images.shape)
print(type(images))
(4750, 128, 128, 3)
<class 'numpy.ndarray'>

The dataset contains 4750 RGB images, each 128x128 pixels with 3 color channels.

In [10]:
labels.sample(n=5)
Out[10]:
Label
3441 Sugar beet
3777 Maize
3527 Sugar beet
2321 Charlock
3493 Sugar beet

Understand the shape of the dataset¶

In [18]:
images.shape
Out[18]:
(4750, 128, 128, 3)

Exploratory Data Analysis¶

  • EDA is an important part of any project involving data.
  • It is important to investigate and understand the data better before building a model with it.
  • A few questions have been mentioned below which will help you understand the data better.
  • A thorough analysis of the data, in addition to the questions mentioned below, should be done.
  1. How do the images of the different plant categories differ from each other?
  2. Is the dataset imbalanced? (Check using bar plots)
In [11]:
#setting the figure size for the visualizations
from matplotlib import rcParams

rcParams['figure.figsize'] = 15,10
In [12]:
#plotting the distribution of the number of each plant's images
ax = sns.countplot(data=labels, x="Label", palette='viridis', order = labels['Label'].value_counts().index)
total = float(len(images))
plt.xticks(rotation=45)
for p in ax.patches:
    percentage = '{:.1f}%'.format(100 * p.get_height()/total)
    x = p.get_x() + p.get_width()/2
    y = p.get_height() + 6
    ax.annotate(percentage, (x, y), ha='center')
plt.show()
In [13]:
#displaying how many images of each plant are in the dataset
labels.value_counts()
Out[13]:
count
Label
Loose Silky-bent 654
Common Chickweed 611
Scentless Mayweed 516
Small-flowered Cranesbill 496
Fat Hen 475
Charlock 390
Sugar beet 385
Cleavers 287
Black-grass 263
Shepherds Purse 231
Common wheat 221
Maize 221

In [ ]:
 

Of the 12 types of plants in the dataset:

  • Two plants, Loose Silky-bent and Common Chickweed, comprise over a quarter (~26%) of all images.
  • The five plants with the most images comprise 58% of all images.
  • The number of images per plant ranges from 221 to 654, so the most common class gives the model nearly three times as many training images as the rarest.
  • Since the distribution of images across plants is not uniform, the CNN may overfit to the better-represented plants.
  • Due to the nature of this project, this imbalance may also lead the model to misclassify under-represented plants as better-represented plants with similar shapes, colors, and other applicable traits.
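One common mitigation for this kind of imbalance (not used in this notebook's training runs, but worth noting) is to pass per-class weights to `model.fit` via its `class_weight` argument. A minimal sketch, using the label counts from the `value_counts()` output above and sklearn's "balanced" weighting formula `n_samples / (n_classes * count)`:

```python
# Sketch: computing per-class weights from the label counts reported above.
# Rarer classes receive proportionally larger weights, so their errors
# contribute more to the loss during training.
label_counts = {
    "Loose Silky-bent": 654, "Common Chickweed": 611, "Scentless Mayweed": 516,
    "Small-flowered Cranesbill": 496, "Fat Hen": 475, "Charlock": 390,
    "Sugar beet": 385, "Cleavers": 287, "Black-grass": 263,
    "Shepherds Purse": 231, "Common wheat": 221, "Maize": 221,
}

n_samples = sum(label_counts.values())   # 4750 images in total
n_classes = len(label_counts)            # 12 species

# "balanced" heuristic: n_samples / (n_classes * count)
class_weights = {
    name: n_samples / (n_classes * count)
    for name, count in label_counts.items()
}
```

The resulting dictionary (keyed by the integer class indices after encoding) could then be passed as `class_weight=` to `model.fit`; this is only a sketch of the idea, not part of the original runs.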

In [14]:
#listing the names of each class of plant
classes = np.unique(labels).tolist()
classes
Out[14]:
['Black-grass',
 'Charlock',
 'Cleavers',
 'Common Chickweed',
 'Common wheat',
 'Fat Hen',
 'Loose Silky-bent',
 'Maize',
 'Scentless Mayweed',
 'Shepherds Purse',
 'Small-flowered Cranesbill',
 'Sugar beet']
In [15]:
#storing the number of classes in a variable
num_classes = len(classes)
num_classes
Out[15]:
12
In [16]:
def plot_images(images,labels):
  keys=dict(labels['Label'])                                                    #mapping each image index to its label
  rows = 3                                                                      #defining number of rows
  cols = 4                                                                      #defining number of columns
  fig = plt.figure(figsize=(15, 12))                                            #defining the figure size
  for i in range(cols):
      for j in range(rows):
          random_index = np.random.randint(0, len(labels))                      #generating a random index into the data
          ax = fig.add_subplot(rows, cols, i * rows + j + 1)                    #adding a subplot
          ax.imshow(images[random_index])                                       #plotting the image
          ax.set_title(keys[random_index])                                      #titling it with its class label
  plt.show()
In [17]:
#showing a sample of 12 images from the dataset
plot_images(images,labels)
In [19]:
def labeled_barplot(data, feature, perc=False, n=None):
  '''
  Barplot with percentage at the top
  data: dataframe
  feature: dataframe column
  perc: whether to display percentages instead of counts (default is False)
  n: displays the top n category levels (default is None, i.e., display all levels)
  '''
  total = len(data[feature]) #length of the column
  count = data[feature].nunique()
  if n is None:
    plt.figure(figsize=(count + 2, 6))
  else:
    plt.figure(figsize=(n + 2, 6))
  plt.xticks(rotation=90, fontsize=15)
  ax = sns.countplot(
      data=data,
      x=feature,
      palette='Paired',
      order=data[feature].value_counts().index[:n]
  )
  for p in ax.patches:
    if perc == True:
      label = '{:.1f}%'.format(
          100*p.get_height()/total
      ) # percentage of each class of the category
    else:
      label = p.get_height() # count of each level of the category
    x = p.get_x() + p.get_width()/2 # width of the plot
    y = p.get_height() # height of the plot

    #annotate the percentage
    ax.annotate(
        label,
        (x, y),
        ha='center',
        va='center',
        size=12,
        xytext=(0,5),
        textcoords='offset points',
    )
  plt.show()
In [20]:
# visualize the balance of the dataset
sns.countplot(data = labels, x='Label');
plt.xticks(rotation=90)
plt.show()
In [21]:
# number of labels for each plant type
labels['Label'].value_counts()
Out[21]:
count
Label
Loose Silky-bent 654
Common Chickweed 611
Scentless Mayweed 516
Small-flowered Cranesbill 496
Fat Hen 475
Charlock 390
Sugar beet 385
Cleavers 287
Black-grass 263
Shepherds Purse 231
Common wheat 221
Maize 221

In [22]:
# check the percentage of each label value
for i in labels['Label'].unique():
  print(i, ": ", (labels[labels['Label']==i].count()/labels.shape[0])*100, "\n")
Small-flowered Cranesbill :  Label   10.44
dtype: float64 

Fat Hen :  Label   10.00
dtype: float64 

Shepherds Purse :  Label   4.86
dtype: float64 

Common wheat :  Label   4.65
dtype: float64 

Common Chickweed :  Label   12.86
dtype: float64 

Charlock :  Label   8.21
dtype: float64 

Cleavers :  Label   6.04
dtype: float64 

Scentless Mayweed :  Label   10.86
dtype: float64 

Sugar beet :  Label   8.11
dtype: float64 

Maize :  Label   4.65
dtype: float64 

Black-grass :  Label   5.54
dtype: float64 

Loose Silky-bent :  Label   13.77
dtype: float64 

In [23]:
# Visualize the balance/distribution of the label values.
labeled_barplot(labels, "Label", perc=True)

Data Pre-Processing¶

Convert the BGR images to RGB images.¶

In [24]:
for i in range(len(images)):
  images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
In [25]:
plot_images(images, labels)

Resize the images¶

As the images are fairly large, it may be computationally expensive to train on them at full resolution; therefore, it is preferable to reduce the image size from 128x128 to 64x64.

In [26]:
images_decreased = []
height = 64
width = 64
dimensions = (width, height)
for i in range(len(images)):
  images_decreased.append(cv2.resize(images[i], dimensions, interpolation=cv2.INTER_LINEAR))
In [27]:
plt.imshow(images[10])
Out[27]:
<matplotlib.image.AxesImage at 0x7e0e2b3a3700>
In [28]:
plt.imshow(images_decreased[10])
Out[28]:
<matplotlib.image.AxesImage at 0x7e0e293770d0>

The reduced images are slightly pixelated; we will try a Gaussian blur.

In [29]:
images_gb = []
for i in range(len(images_decreased)):
  images_gb.append(cv2.GaussianBlur(images_decreased[i], ksize=(3,3), sigmaX=0))
In [30]:
plt.imshow(images_gb[10])
Out[30]:
<matplotlib.image.AxesImage at 0x7e0e293f5f60>

The Gaussian blur has made the images less distinct, so we will use the images_decreased set for the initial model training.

Data Preparation for Modeling¶

  • Before you proceed to build a model, you need to split the data into train, validation, and test sets to be able to evaluate the model you build on the train data.
  • You'll have to encode the categorical target labels and scale the pixel values.
  • You will build a model using the train data and then check its performance.

Split the dataset

In [31]:
X_temp, X_test, y_temp, y_test = train_test_split(np.array(images_decreased), labels, test_size=0.1, random_state=1, stratify=labels)
In [32]:
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size = 0.1, random_state=1, stratify = y_temp)
In [33]:
print("X_train:", X_train.shape)
print("y_train:", y_train.shape)
print("X_val:", X_val.shape)
print("y_val:", y_val.shape)
print("X_test:", X_test.shape)
print("y_test:", y_test.shape)
X_train: (3847, 64, 64, 3)
y_train: (3847, 1)
X_val: (428, 64, 64, 3)
y_val: (428, 1)
X_test: (475, 64, 64, 3)
y_test: (475, 1)

Encode the target labels¶

In [35]:
from sklearn.preprocessing import LabelBinarizer
In [36]:
# convert the labels from names to OHE vectors
enc = LabelBinarizer()
y_train_encoded = enc.fit_transform(y_train)
y_val_encoded = enc.transform(y_val)
y_test_encoded = enc.transform(y_test)
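Since the model's softmax outputs will come back as probability vectors, it is worth noting how the fitted `LabelBinarizer` maps one-hot vectors back to species names. A small illustrative sketch (the toy labels below are made up for demonstration, not drawn from the dataset):

```python
# Sketch: encoding a few mock labels and decoding them back.
import numpy as np
from sklearn.preprocessing import LabelBinarizer

enc = LabelBinarizer()
toy_labels = np.array(["Maize", "Charlock", "Maize", "Sugar beet"])
one_hot = enc.fit_transform(toy_labels)          # shape (4, 3): 3 unique classes

# inverse_transform recovers the original names from one-hot rows
decoded = enc.inverse_transform(one_hot)

# a mock softmax output row can be decoded via argmax into enc.classes_
probs = np.array([[0.1, 0.8, 0.1]])
pred_name = enc.classes_[np.argmax(probs, axis=1)][0]
```

The same `enc.classes_[np.argmax(...)]` pattern applies to the real model's predictions, since `enc` was fit on the training labels above.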

Data Normalization¶

In [86]:
# normalize the pixel values, convert from 0-255 to 0-1
X_train_normalized = X_train.astype('float32')/255.0
X_val_normalized = X_val.astype('float32')/255.0
X_test_normalized = X_test.astype('float32')/255.0
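A quick sanity check on this scaling step (a sketch using randomly generated toy data, not the actual images): after dividing by 255, pixel values should lie in [0, 1], which keeps activations and gradient magnitudes in a stable range.

```python
# Sketch: verifying that uint8 pixel data divided by 255 lands in [0, 1].
import numpy as np

toy_images = np.random.randint(0, 256, size=(4, 64, 64, 3), dtype=np.uint8)
toy_normalized = toy_images.astype("float32") / 255.0
```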

Model Building¶

In [40]:
from tensorflow.keras import backend
In [43]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)

The first, baseline model, will be a CNN without batch normalization or dropout. Depending on performance, will add these layers as necessary.

In [44]:
# initialize the model as sequential
model1 = Sequential()
In [45]:
# start with a conv layer
model1.add(Conv2D(64, (3,3), activation='relu', padding='same', input_shape=(64,64,3)))
model1.add(MaxPooling2D(2,2))
model1.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model1.add(MaxPooling2D(2,2))
model1.add(Conv2D(16, (3,3), activation='relu', padding='same'))
model1.add(MaxPooling2D(2,2))

# flatten
model1.add(Flatten())

# ANN layers
model1.add(Dense(64, activation='relu'))
model1.add(Dense(32, activation='relu'))
model1.add(Dense(16, activation='relu'))

#output layer
model1.add(Dense(12, activation='softmax'))
In [46]:
model1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 32, 32, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 16, 16, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 16, 16, 16)          │           4,624 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 8, 8, 16)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 1024)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 64)                  │          65,600 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 93,292 (364.42 KB)
 Trainable params: 93,292 (364.42 KB)
 Non-trainable params: 0 (0.00 B)
In [47]:
history1 = model1.fit(X_train_normalized, y_train_encoded,
                    validation_data=(X_val_normalized, y_val_encoded),
                    epochs=10,
                    batch_size=32,
                    verbose=2
)
Epoch 1/10
121/121 - 20s - 167ms/step - accuracy: 0.1302 - loss: 2.4296 - val_accuracy: 0.2290 - val_loss: 2.2644
Epoch 2/10
121/121 - 6s - 52ms/step - accuracy: 0.3447 - loss: 1.9844 - val_accuracy: 0.3925 - val_loss: 1.8555
Epoch 3/10
121/121 - 1s - 6ms/step - accuracy: 0.4962 - loss: 1.4952 - val_accuracy: 0.5467 - val_loss: 1.4242
Epoch 4/10
121/121 - 1s - 6ms/step - accuracy: 0.5792 - loss: 1.1983 - val_accuracy: 0.6192 - val_loss: 1.2461
Epoch 5/10
121/121 - 1s - 10ms/step - accuracy: 0.6395 - loss: 1.0278 - val_accuracy: 0.6636 - val_loss: 1.1240
Epoch 6/10
121/121 - 1s - 11ms/step - accuracy: 0.6818 - loss: 0.9202 - val_accuracy: 0.6752 - val_loss: 1.0857
Epoch 7/10
121/121 - 1s - 9ms/step - accuracy: 0.7143 - loss: 0.8323 - val_accuracy: 0.6729 - val_loss: 1.0618
Epoch 8/10
121/121 - 1s - 6ms/step - accuracy: 0.7369 - loss: 0.7623 - val_accuracy: 0.6776 - val_loss: 1.0676
Epoch 9/10
121/121 - 1s - 5ms/step - accuracy: 0.7614 - loss: 0.6867 - val_accuracy: 0.6986 - val_loss: 1.0125
Epoch 10/10
121/121 - 1s - 6ms/step - accuracy: 0.7801 - loss: 0.6320 - val_accuracy: 0.7056 - val_loss: 0.9939
In [48]:
plt.plot(history1.history['accuracy'])
plt.plot(history1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
In [49]:
accuracy = model1.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 1s - 59ms/step - accuracy: 0.7116 - loss: 0.8807
In [50]:
y_pred1 = model1.predict(X_test_normalized)
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred1_arg=np.argmax(y_pred1,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)

# Plotting the confusion matrix using tf.math.confusion_matrix;
# storing it as cm avoids shadowing sklearn's confusion_matrix function
cm = tf.math.confusion_matrix(y_test_arg,y_pred1_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    cm,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
15/15 ━━━━━━━━━━━━━━━━━━━━ 1s 22ms/step
In [51]:
cr1 =metrics.classification_report(y_test_arg,y_pred1_arg, output_dict=True)
f1_1 = cr1['macro avg']['f1-score']
acc_1 = cr1['accuracy']
print('f1-score:',f1_1)
print('Accuracy:', acc_1)
f1-score: 0.6371056386785358
Accuracy: 0.7115789473684211

The initial model trained nicely; it is slightly overfit beyond 5 epochs, but test accuracy is not great (~71%).

We will try adding Batch Normalization and Dropout layers.
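Given the overfitting seen above, an `EarlyStopping` callback (already imported from `keras.callbacks`) would be a natural addition to `model.fit`. The rule it implements can be sketched in plain Python: stop once the validation loss has failed to improve for `patience` consecutive epochs.

```python
# Sketch of the early-stopping rule (illustrative only; in practice you would
# pass keras.callbacks.EarlyStopping(monitor='val_loss', patience=...,
# restore_best_weights=True) to model.fit instead of hand-rolling this).
def epochs_to_run(val_losses, patience=3):
    best = float("inf")   # best validation loss seen so far
    waited = 0            # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch  # training would stop at this epoch
    return len(val_losses)    # no early stop triggered
```

With `restore_best_weights=True`, the model would also be rolled back to the epoch with the lowest validation loss rather than keeping the final, noisier weights.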

In [64]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
In [65]:
# initialize the model as sequential
model2 = Sequential()
In [66]:
# start with a conv layer
model2.add(Conv2D(64, (3,3), activation='relu', padding='same', input_shape=(64,64,3)))
model2.add(MaxPooling2D(2,2))
model2.add(BatchNormalization())
model2.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model2.add(MaxPooling2D(2,2))
model2.add(BatchNormalization())
model2.add(Conv2D(16, (3,3), activation='relu', padding='same'))
model2.add(MaxPooling2D(2,2))
model2.add(BatchNormalization())

# flatten
model2.add(Flatten())

# ANN layers
model2.add(Dense(64, activation='relu'))
model2.add(Dropout((0.25)))
model2.add(Dense(32, activation='relu'))
model2.add(Dropout((0.25)))
model2.add(Dense(16, activation='relu'))

#output layer
model2.add(Dense(12, activation='softmax'))
In [67]:
model2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization                  │ (None, 32, 32, 64)          │             256 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 32, 32, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 16, 16, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_1                │ (None, 16, 16, 32)          │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 16, 16, 16)          │           4,624 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 8, 8, 16)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_2                │ (None, 8, 8, 16)            │              64 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 1024)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 64)                  │          65,600 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 64)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 93,740 (366.17 KB)
 Trainable params: 93,516 (365.30 KB)
 Non-trainable params: 224 (896.00 B)
In [68]:
history2 = model2.fit(X_train_normalized, y_train_encoded,
                    validation_data=(X_val_normalized, y_val_encoded),
                    epochs=50,
                    batch_size=32,
                    verbose=2,
)
Epoch 1/50
121/121 - 22s - 179ms/step - accuracy: 0.2173 - loss: 2.2867 - val_accuracy: 0.1379 - val_loss: 3.7254
Epoch 2/50
121/121 - 6s - 47ms/step - accuracy: 0.3787 - loss: 1.8152 - val_accuracy: 0.1379 - val_loss: 4.6051
Epoch 3/50
121/121 - 2s - 19ms/step - accuracy: 0.4856 - loss: 1.5052 - val_accuracy: 0.1542 - val_loss: 3.4327
Epoch 4/50
121/121 - 1s - 12ms/step - accuracy: 0.5586 - loss: 1.2826 - val_accuracy: 0.3318 - val_loss: 2.2043
Epoch 5/50
121/121 - 2s - 19ms/step - accuracy: 0.6356 - loss: 1.0843 - val_accuracy: 0.1799 - val_loss: 4.0104
Epoch 6/50
121/121 - 1s - 11ms/step - accuracy: 0.6782 - loss: 0.9494 - val_accuracy: 0.3528 - val_loss: 2.5110
Epoch 7/50
121/121 - 2s - 14ms/step - accuracy: 0.7211 - loss: 0.8234 - val_accuracy: 0.6986 - val_loss: 0.8903
Epoch 8/50
121/121 - 2s - 15ms/step - accuracy: 0.7468 - loss: 0.7604 - val_accuracy: 0.6425 - val_loss: 1.0698
Epoch 9/50
121/121 - 2s - 14ms/step - accuracy: 0.7694 - loss: 0.6810 - val_accuracy: 0.6799 - val_loss: 0.9886
Epoch 10/50
121/121 - 2s - 20ms/step - accuracy: 0.7944 - loss: 0.6049 - val_accuracy: 0.6893 - val_loss: 1.0192
Epoch 11/50
121/121 - 1s - 8ms/step - accuracy: 0.8123 - loss: 0.5504 - val_accuracy: 0.5607 - val_loss: 1.7544
Epoch 12/50
121/121 - 1s - 7ms/step - accuracy: 0.8326 - loss: 0.4841 - val_accuracy: 0.7033 - val_loss: 1.0513
Epoch 13/50
121/121 - 1s - 7ms/step - accuracy: 0.8310 - loss: 0.4842 - val_accuracy: 0.6472 - val_loss: 1.3727
Epoch 14/50
121/121 - 1s - 10ms/step - accuracy: 0.8544 - loss: 0.4188 - val_accuracy: 0.7033 - val_loss: 0.9267
Epoch 15/50
121/121 - 1s - 6ms/step - accuracy: 0.8664 - loss: 0.3942 - val_accuracy: 0.7827 - val_loss: 0.7638
Epoch 16/50
121/121 - 1s - 11ms/step - accuracy: 0.8700 - loss: 0.3653 - val_accuracy: 0.7266 - val_loss: 0.9704
Epoch 17/50
121/121 - 1s - 10ms/step - accuracy: 0.8791 - loss: 0.3446 - val_accuracy: 0.7967 - val_loss: 0.6708
Epoch 18/50
121/121 - 1s - 11ms/step - accuracy: 0.8825 - loss: 0.3238 - val_accuracy: 0.7944 - val_loss: 0.8385
Epoch 19/50
121/121 - 1s - 10ms/step - accuracy: 0.8908 - loss: 0.3072 - val_accuracy: 0.7827 - val_loss: 0.9162
Epoch 20/50
121/121 - 1s - 7ms/step - accuracy: 0.8963 - loss: 0.2953 - val_accuracy: 0.7079 - val_loss: 1.1427
Epoch 21/50
121/121 - 1s - 6ms/step - accuracy: 0.9093 - loss: 0.2710 - val_accuracy: 0.7523 - val_loss: 0.8026
Epoch 22/50
121/121 - 1s - 12ms/step - accuracy: 0.9023 - loss: 0.2677 - val_accuracy: 0.5794 - val_loss: 1.7289
Epoch 23/50
121/121 - 1s - 8ms/step - accuracy: 0.9023 - loss: 0.2662 - val_accuracy: 0.6495 - val_loss: 1.5175
Epoch 24/50
121/121 - 1s - 8ms/step - accuracy: 0.9077 - loss: 0.2521 - val_accuracy: 0.7430 - val_loss: 1.3299
Epoch 25/50
121/121 - 1s - 9ms/step - accuracy: 0.9150 - loss: 0.2458 - val_accuracy: 0.7570 - val_loss: 1.2151
Epoch 26/50
121/121 - 1s - 6ms/step - accuracy: 0.9218 - loss: 0.2190 - val_accuracy: 0.6355 - val_loss: 1.6959
Epoch 27/50
121/121 - 1s - 6ms/step - accuracy: 0.9241 - loss: 0.2066 - val_accuracy: 0.6869 - val_loss: 1.3032
Epoch 28/50
121/121 - 1s - 7ms/step - accuracy: 0.9176 - loss: 0.2309 - val_accuracy: 0.8014 - val_loss: 0.8495
Epoch 29/50
121/121 - 1s - 11ms/step - accuracy: 0.9267 - loss: 0.2000 - val_accuracy: 0.7360 - val_loss: 1.4978
Epoch 30/50
121/121 - 1s - 7ms/step - accuracy: 0.9324 - loss: 0.1928 - val_accuracy: 0.7967 - val_loss: 0.8197
Epoch 31/50
121/121 - 1s - 10ms/step - accuracy: 0.9350 - loss: 0.1752 - val_accuracy: 0.7734 - val_loss: 1.0270
Epoch 32/50
121/121 - 1s - 10ms/step - accuracy: 0.9301 - loss: 0.2129 - val_accuracy: 0.7874 - val_loss: 0.9036
Epoch 33/50
121/121 - 1s - 6ms/step - accuracy: 0.9355 - loss: 0.1863 - val_accuracy: 0.7991 - val_loss: 0.9385
Epoch 34/50
121/121 - 1s - 11ms/step - accuracy: 0.9426 - loss: 0.1708 - val_accuracy: 0.7126 - val_loss: 1.4379
Epoch 35/50
121/121 - 1s - 7ms/step - accuracy: 0.9478 - loss: 0.1541 - val_accuracy: 0.6636 - val_loss: 1.5550
Epoch 36/50
121/121 - 1s - 11ms/step - accuracy: 0.9478 - loss: 0.1532 - val_accuracy: 0.7804 - val_loss: 1.1972
Epoch 37/50
121/121 - 1s - 8ms/step - accuracy: 0.9532 - loss: 0.1432 - val_accuracy: 0.4065 - val_loss: 4.5059
Epoch 38/50
121/121 - 1s - 9ms/step - accuracy: 0.9387 - loss: 0.1770 - val_accuracy: 0.7757 - val_loss: 1.1923
Epoch 39/50
121/121 - 1s - 10ms/step - accuracy: 0.9524 - loss: 0.1359 - val_accuracy: 0.7173 - val_loss: 1.6905
Epoch 40/50
121/121 - 1s - 7ms/step - accuracy: 0.9462 - loss: 0.1494 - val_accuracy: 0.7547 - val_loss: 1.4195
Epoch 41/50
121/121 - 1s - 6ms/step - accuracy: 0.9550 - loss: 0.1264 - val_accuracy: 0.7967 - val_loss: 0.9065
Epoch 42/50
121/121 - 1s - 7ms/step - accuracy: 0.9607 - loss: 0.1111 - val_accuracy: 0.7266 - val_loss: 1.4497
Epoch 43/50
121/121 - 1s - 10ms/step - accuracy: 0.9496 - loss: 0.1475 - val_accuracy: 0.7383 - val_loss: 1.2429
Epoch 44/50
121/121 - 1s - 8ms/step - accuracy: 0.9581 - loss: 0.1365 - val_accuracy: 0.7664 - val_loss: 1.0403
Epoch 45/50
121/121 - 1s - 11ms/step - accuracy: 0.9641 - loss: 0.1125 - val_accuracy: 0.8224 - val_loss: 0.8561
Epoch 46/50
121/121 - 1s - 10ms/step - accuracy: 0.9623 - loss: 0.1139 - val_accuracy: 0.7079 - val_loss: 1.5457
Epoch 47/50
121/121 - 1s - 8ms/step - accuracy: 0.9691 - loss: 0.1040 - val_accuracy: 0.7453 - val_loss: 1.3265
Epoch 48/50
121/121 - 3s - 21ms/step - accuracy: 0.9659 - loss: 0.1073 - val_accuracy: 0.7243 - val_loss: 1.2340
Epoch 49/50
121/121 - 2s - 13ms/step - accuracy: 0.9613 - loss: 0.1120 - val_accuracy: 0.6846 - val_loss: 1.6820
Epoch 50/50
121/121 - 2s - 17ms/step - accuracy: 0.9633 - loss: 0.1127 - val_accuracy: 0.6379 - val_loss: 2.3727
In [70]:
plt.plot(history2.history['accuracy'])
plt.plot(history2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
In [71]:
test_loss, test_accuracy = model2.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 1s - 48ms/step - accuracy: 0.6463 - loss: 2.3828
In [72]:
y_pred2 = model2.predict(X_test_normalized)
# Obtain the categorical class indices from y_test_encoded and y_pred2
y_pred2_arg = np.argmax(y_pred2, axis=1)
y_test_arg = np.argmax(y_test_encoded, axis=1)

# Plot the confusion matrix using tf.math.confusion_matrix(), a predefined TensorFlow function
confusion_matrix = tf.math.confusion_matrix(y_test_arg, y_pred2_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
15/15 ━━━━━━━━━━━━━━━━━━━━ 1s 25ms/step
In [73]:
cr2 = metrics.classification_report(y_test_arg, y_pred2_arg, output_dict=True)
f1_2 = cr2['macro avg']['f1-score']
acc_2 = cr2['accuracy']
print('f1-score:', f1_2)
print('Accuracy:', acc_2)
f1-score: 0.6178301966668281
Accuracy: 0.6463157894736842

Adding Batch Normalization and Dropout improved the test accuracy to about 64.6%.

Validation accuracy showed little sustained improvement after roughly the 10th epoch; it fluctuated while training accuracy kept climbing, which suggests overfitting.
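The plateau after the 10th epoch is the kind of situation early stopping is meant for. The patience logic can be sketched in plain Python (the loss values below are hypothetical; Keras' `EarlyStopping` callback implements the same idea via `monitor='val_loss'` and a `patience` argument):

```python
# Minimal sketch of early-stopping "patience" logic (plain Python, no Keras).

def early_stop_epoch(val_losses, patience=5, min_delta=1e-5):
    """Return the 1-based epoch at which training would stop,
    or None if the loss never plateaus for `patience` epochs."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:   # meaningful improvement: reset the counter
            best = loss
            wait = 0
        else:                         # no improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical loss curve: improves for 3 epochs, then plateaus.
losses = [1.0, 0.8, 0.7, 0.71, 0.70, 0.72, 0.70, 0.73]
print(early_stop_epoch(losses, patience=3))  # stops at epoch 6
```

This is only the stopping rule; the real callback can additionally roll back to the best weights with `restore_best_weights=True`.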

In [74]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
In [75]:
# initialize the model as sequential
model3 = Sequential()
In [76]:
# start with a conv layer
model3.add(Conv2D(64, (3,3), activation='relu', padding='same', input_shape=(64,64,3)))
model3.add(MaxPooling2D(2,2))
model3.add(BatchNormalization())
model3.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model3.add(MaxPooling2D(2,2))
model3.add(BatchNormalization())
model3.add(Conv2D(16, (3,3), activation='relu', padding='same'))
model3.add(MaxPooling2D(2,2))
model3.add(BatchNormalization())

# flatten
model3.add(Flatten())

# ANN layers
model3.add(Dense(64, activation='relu'))
model3.add(Dropout((0.25)))
model3.add(BatchNormalization())
model3.add(Dense(32, activation='relu'))
model3.add(Dropout((0.25)))
model3.add(BatchNormalization())
model3.add(Dense(32, activation='relu'))
model3.add(Dropout((0.25)))
model3.add(BatchNormalization())
model3.add(Dense(16, activation='relu'))
model3.add(BatchNormalization())
#output layer
model3.add(Dense(12, activation='softmax'))
In [77]:
model3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization                  │ (None, 32, 32, 64)          │             256 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 32, 32, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 16, 16, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_1                │ (None, 16, 16, 32)          │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 16, 16, 16)          │           4,624 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 8, 8, 16)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_2                │ (None, 8, 8, 16)            │              64 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 1024)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 64)                  │          65,600 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 64)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_3                │ (None, 64)                  │             256 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_4                │ (None, 32)                  │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_2 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_5                │ (None, 32)                  │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_6                │ (None, 16)                  │              64 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_4 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 95,372 (372.55 KB)
 Trainable params: 94,860 (370.55 KB)
 Non-trainable params: 512 (2.00 KB)
In [78]:
history3 = model3.fit(X_train_normalized, y_train_encoded,
                    validation_data=(X_val_normalized, y_val_encoded),
                    epochs=50,
                    batch_size=32,
                    verbose=2,
                    )
Epoch 1/50
121/121 - 29s - 236ms/step - accuracy: 0.1240 - loss: 2.6493 - val_accuracy: 0.1308 - val_loss: 2.4563
Epoch 2/50
121/121 - 1s - 9ms/step - accuracy: 0.2649 - loss: 2.1672 - val_accuracy: 0.1379 - val_loss: 2.6308
Epoch 3/50
121/121 - 2s - 14ms/step - accuracy: 0.3501 - loss: 1.8899 - val_accuracy: 0.3458 - val_loss: 2.1038
Epoch 4/50
121/121 - 1s - 11ms/step - accuracy: 0.4229 - loss: 1.7097 - val_accuracy: 0.3107 - val_loss: 2.0537
Epoch 5/50
121/121 - 1s - 11ms/step - accuracy: 0.4705 - loss: 1.5767 - val_accuracy: 0.4463 - val_loss: 1.6298
Epoch 6/50
121/121 - 3s - 24ms/step - accuracy: 0.5199 - loss: 1.4549 - val_accuracy: 0.4696 - val_loss: 1.5410
Epoch 7/50
121/121 - 1s - 11ms/step - accuracy: 0.5061 - loss: 1.4446 - val_accuracy: 0.3294 - val_loss: 1.9456
Epoch 8/50
121/121 - 2s - 13ms/step - accuracy: 0.5482 - loss: 1.3225 - val_accuracy: 0.5888 - val_loss: 1.2001
Epoch 9/50
121/121 - 1s - 8ms/step - accuracy: 0.5849 - loss: 1.2479 - val_accuracy: 0.2687 - val_loss: 2.1845
Epoch 10/50
121/121 - 1s - 9ms/step - accuracy: 0.6135 - loss: 1.1597 - val_accuracy: 0.3248 - val_loss: 2.1010
Epoch 11/50
121/121 - 1s - 8ms/step - accuracy: 0.6356 - loss: 1.0995 - val_accuracy: 0.2804 - val_loss: 2.2826
Epoch 12/50
121/121 - 1s - 7ms/step - accuracy: 0.6571 - loss: 1.0426 - val_accuracy: 0.5350 - val_loss: 1.4390
Epoch 13/50
121/121 - 1s - 7ms/step - accuracy: 0.6657 - loss: 0.9875 - val_accuracy: 0.4486 - val_loss: 1.6766
Epoch 14/50
121/121 - 1s - 7ms/step - accuracy: 0.6925 - loss: 0.9387 - val_accuracy: 0.5911 - val_loss: 1.2888
Epoch 15/50
121/121 - 1s - 7ms/step - accuracy: 0.7265 - loss: 0.8683 - val_accuracy: 0.7547 - val_loss: 0.7910
Epoch 16/50
121/121 - 1s - 7ms/step - accuracy: 0.7156 - loss: 0.8558 - val_accuracy: 0.6402 - val_loss: 1.0545
Epoch 17/50
121/121 - 1s - 11ms/step - accuracy: 0.7351 - loss: 0.8112 - val_accuracy: 0.6682 - val_loss: 1.0018
Epoch 18/50
121/121 - 1s - 10ms/step - accuracy: 0.7440 - loss: 0.7755 - val_accuracy: 0.2967 - val_loss: 2.3481
Epoch 19/50
121/121 - 1s - 12ms/step - accuracy: 0.7570 - loss: 0.7308 - val_accuracy: 0.7290 - val_loss: 0.9314
Epoch 20/50
121/121 - 1s - 8ms/step - accuracy: 0.7710 - loss: 0.6953 - val_accuracy: 0.7313 - val_loss: 0.8393
Epoch 21/50
121/121 - 1s - 11ms/step - accuracy: 0.7632 - loss: 0.7087 - val_accuracy: 0.4907 - val_loss: 1.7698
Epoch 22/50
121/121 - 1s - 7ms/step - accuracy: 0.7879 - loss: 0.6439 - val_accuracy: 0.7804 - val_loss: 0.6739
Epoch 23/50
121/121 - 1s - 7ms/step - accuracy: 0.7884 - loss: 0.6406 - val_accuracy: 0.7710 - val_loss: 0.7248
Epoch 24/50
121/121 - 1s - 7ms/step - accuracy: 0.7848 - loss: 0.6316 - val_accuracy: 0.6822 - val_loss: 0.9309
Epoch 25/50
121/121 - 1s - 7ms/step - accuracy: 0.8024 - loss: 0.5898 - val_accuracy: 0.8435 - val_loss: 0.5488
Epoch 26/50
121/121 - 1s - 10ms/step - accuracy: 0.8113 - loss: 0.5823 - val_accuracy: 0.6916 - val_loss: 0.9216
Epoch 27/50
121/121 - 1s - 11ms/step - accuracy: 0.8053 - loss: 0.5882 - val_accuracy: 0.5958 - val_loss: 1.2418
Epoch 28/50
121/121 - 1s - 7ms/step - accuracy: 0.7949 - loss: 0.6169 - val_accuracy: 0.7033 - val_loss: 1.0228
Epoch 29/50
121/121 - 1s - 10ms/step - accuracy: 0.8217 - loss: 0.5301 - val_accuracy: 0.7734 - val_loss: 0.8080
Epoch 30/50
121/121 - 1s - 10ms/step - accuracy: 0.8321 - loss: 0.5025 - val_accuracy: 0.8248 - val_loss: 0.5851
Epoch 31/50
121/121 - 1s - 7ms/step - accuracy: 0.8336 - loss: 0.4988 - val_accuracy: 0.8107 - val_loss: 0.6298
Epoch 32/50
121/121 - 1s - 12ms/step - accuracy: 0.8404 - loss: 0.4664 - val_accuracy: 0.6565 - val_loss: 1.1940
Epoch 33/50
121/121 - 1s - 10ms/step - accuracy: 0.8477 - loss: 0.4609 - val_accuracy: 0.7266 - val_loss: 0.9404
Epoch 34/50
121/121 - 1s - 8ms/step - accuracy: 0.8485 - loss: 0.4615 - val_accuracy: 0.8061 - val_loss: 0.6808
Epoch 35/50
121/121 - 1s - 9ms/step - accuracy: 0.8472 - loss: 0.4460 - val_accuracy: 0.7897 - val_loss: 0.7690
Epoch 36/50
121/121 - 1s - 7ms/step - accuracy: 0.8635 - loss: 0.4059 - val_accuracy: 0.5093 - val_loss: 2.0239
Epoch 37/50
121/121 - 1s - 7ms/step - accuracy: 0.8640 - loss: 0.4108 - val_accuracy: 0.6238 - val_loss: 1.4294
Epoch 38/50
121/121 - 1s - 7ms/step - accuracy: 0.8625 - loss: 0.4102 - val_accuracy: 0.7967 - val_loss: 0.7054
Epoch 39/50
121/121 - 1s - 11ms/step - accuracy: 0.8531 - loss: 0.4261 - val_accuracy: 0.7079 - val_loss: 1.0375
Epoch 40/50
121/121 - 1s - 7ms/step - accuracy: 0.8664 - loss: 0.4045 - val_accuracy: 0.7991 - val_loss: 0.7962
Epoch 41/50
121/121 - 1s - 7ms/step - accuracy: 0.8703 - loss: 0.3755 - val_accuracy: 0.7430 - val_loss: 0.9819
Epoch 42/50
121/121 - 1s - 10ms/step - accuracy: 0.8427 - loss: 0.4784 - val_accuracy: 0.7477 - val_loss: 0.9589
Epoch 43/50
121/121 - 1s - 10ms/step - accuracy: 0.8731 - loss: 0.3681 - val_accuracy: 0.7617 - val_loss: 0.9005
Epoch 44/50
121/121 - 1s - 11ms/step - accuracy: 0.8552 - loss: 0.4305 - val_accuracy: 0.7103 - val_loss: 1.0893
Epoch 45/50
121/121 - 1s - 11ms/step - accuracy: 0.8776 - loss: 0.3595 - val_accuracy: 0.7150 - val_loss: 1.1349
Epoch 46/50
121/121 - 1s - 11ms/step - accuracy: 0.8791 - loss: 0.3499 - val_accuracy: 0.7477 - val_loss: 0.9106
Epoch 47/50
121/121 - 1s - 8ms/step - accuracy: 0.8929 - loss: 0.3246 - val_accuracy: 0.7313 - val_loss: 1.1405
Epoch 48/50
121/121 - 1s - 7ms/step - accuracy: 0.8932 - loss: 0.3143 - val_accuracy: 0.7617 - val_loss: 0.9238
Epoch 49/50
121/121 - 1s - 7ms/step - accuracy: 0.8851 - loss: 0.3388 - val_accuracy: 0.7874 - val_loss: 0.8312
Epoch 50/50
121/121 - 1s - 7ms/step - accuracy: 0.8555 - loss: 0.4385 - val_accuracy: 0.1472 - val_loss: 8.1950
In [79]:
plt.plot(history3.history['accuracy'])
plt.plot(history3.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
In [80]:
test_loss, test_accuracy = model3.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 1s - 65ms/step - accuracy: 0.1368 - loss: 8.2532
In [81]:
y_pred3 = model3.predict(X_test_normalized)
# Obtain the categorical class indices from y_test_encoded and y_pred3
y_pred3_arg = np.argmax(y_pred3, axis=1)
y_test_arg = np.argmax(y_test_encoded, axis=1)

# Plot the confusion matrix using tf.math.confusion_matrix(), a predefined TensorFlow function
confusion_matrix = tf.math.confusion_matrix(y_test_arg, y_pred3_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
15/15 ━━━━━━━━━━━━━━━━━━━━ 1s 37ms/step
In [82]:
cr3 = metrics.classification_report(y_test_arg, y_pred3_arg, output_dict=True)
f1_3 = cr3['macro avg']['f1-score']
acc_3 = cr3['accuracy']
print('f1-score:', f1_3)
print('Accuracy:', acc_3)
f1-score: 0.08424386742111484
Accuracy: 0.1368421052631579

Model Performance Improvement¶

Reducing the Learning Rate:

Hint: Use the ReduceLROnPlateau() callback, which decreases the learning rate by a given factor when the monitored loss has stopped decreasing for a set number of epochs. Training may then continue to reduce the loss at the smaller learning rate; if the loss still does not decrease, the reduction can be triggered again in pursuit of a lower loss.
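The plateau-reduction idea can be sketched in plain Python (hypothetical loss values; `factor`, `patience`, and `min_delta` mirror the Keras arguments, but this simplified version omits `cooldown` and `min_lr` handling):

```python
# Simplified simulation of ReduceLROnPlateau: halve the learning rate
# whenever the monitored loss fails to improve for `patience` epochs.

def schedule_lr(val_losses, lr=1e-3, factor=0.5, patience=2, min_delta=1e-5):
    """Return the learning rate in effect after each epoch's val_loss."""
    best = float('inf')
    wait = 0
    lrs = []
    for loss in val_losses:
        if loss < best - min_delta:   # improvement: reset the plateau counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:      # plateau: reduce LR and start counting again
                lr *= factor
                wait = 0
        lrs.append(lr)
    return lrs

# Hypothetical stalled loss: the LR is halved once patience runs out.
print(schedule_lr([1.0, 1.1, 1.2, 1.3], lr=1e-3, patience=2))
```

The learning-rate column in the training log further below shows the real callback doing exactly this, stepping from 1e-3 down to 7.8e-6 over the run.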

Data Augmentation¶

Remember: data augmentation should be applied only to the training set, never to the validation or test sets.
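This train-only rule can be sketched with numpy; a horizontal flip stands in for the richer ImageDataGenerator pipeline (shifts, rotations, zooms) configured later in the notebook, and the demo arrays below are hypothetical stand-ins for the real splits:

```python
import numpy as np

def augment_train(X_train):
    """Double the training set by appending horizontally flipped copies."""
    flipped = X_train[:, :, ::-1, :]  # reverse the width axis of each image
    return np.concatenate([X_train, flipped], axis=0)

X_train_demo = np.random.rand(4, 64, 64, 3)  # stand-in training batch
X_val_demo = np.random.rand(2, 64, 64, 3)    # validation data is left untouched

X_train_aug = augment_train(X_train_demo)
print(X_train_aug.shape)  # (8, 64, 64, 3); X_val_demo keeps its original shape
```

In Keras the same separation falls out naturally: `datagen.flow(X_train, y_train)` feeds augmented batches to `fit()`, while `validation_data` receives the raw arrays.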

In [96]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
In [97]:
# initialize the model as sequential
model4 = Sequential()
In [98]:
# start with a conv layer
model4.add(Conv2D(64, (3,3), activation='relu', padding='same', input_shape=(64,64,3)))
model4.add(MaxPooling2D(2,2))
model4.add(BatchNormalization())
model4.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model4.add(MaxPooling2D(2,2))
model4.add(BatchNormalization())
model4.add(Conv2D(16, (3,3), activation='relu', padding='same'))
model4.add(MaxPooling2D(2,2))
model4.add(BatchNormalization())

# flatten
model4.add(Flatten())

# ANN layers
model4.add(Dense(64, activation='relu'))
model4.add(Dropout((0.25)))
model4.add(Dense(32, activation='relu'))
model4.add(Dropout((0.25)))
model4.add(Dense(16, activation='relu'))

#output layer
model4.add(Dense(12, activation='softmax'))
In [99]:
model4.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization                  │ (None, 32, 32, 64)          │             256 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 32, 32, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 16, 16, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_1                │ (None, 16, 16, 32)          │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 16, 16, 16)          │           4,624 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 8, 8, 16)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_2                │ (None, 8, 8, 16)            │              64 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 1024)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 64)                  │          65,600 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 64)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 93,740 (366.17 KB)
 Trainable params: 93,516 (365.30 KB)
 Non-trainable params: 224 (896.00 B)
In [100]:
early_stop = EarlyStopping(monitor='val_loss', min_delta=0.00001, patience=5)  # restore_best_weights=True is another option
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, mode='auto', min_delta=0.00001)
callbacks = [early_stop, reduce_lr]

history4 = model4.fit(X_train_normalized, y_train_encoded,
                    validation_data=(X_val_normalized, y_val_encoded),
                    epochs=50,
                    batch_size=32,
                    verbose=2,
                    callbacks=callbacks
)
Epoch 1/50
121/121 - 12s - 97ms/step - accuracy: 0.2184 - loss: 2.2903 - val_accuracy: 0.1379 - val_loss: 3.0072 - learning_rate: 0.0010
Epoch 2/50
121/121 - 1s - 7ms/step - accuracy: 0.3811 - loss: 1.8341 - val_accuracy: 0.1379 - val_loss: 4.0873 - learning_rate: 0.0010
Epoch 3/50
121/121 - 1s - 6ms/step - accuracy: 0.4754 - loss: 1.5009 - val_accuracy: 0.1379 - val_loss: 4.6065 - learning_rate: 0.0010
Epoch 4/50
121/121 - 1s - 6ms/step - accuracy: 0.5659 - loss: 1.2532 - val_accuracy: 0.1612 - val_loss: 3.5887 - learning_rate: 5.0000e-04
Epoch 5/50
121/121 - 1s - 10ms/step - accuracy: 0.6239 - loss: 1.0975 - val_accuracy: 0.2827 - val_loss: 2.1923 - learning_rate: 5.0000e-04
Epoch 6/50
121/121 - 1s - 10ms/step - accuracy: 0.6610 - loss: 0.9958 - val_accuracy: 0.6285 - val_loss: 1.0835 - learning_rate: 5.0000e-04
Epoch 7/50
121/121 - 1s - 6ms/step - accuracy: 0.6831 - loss: 0.8971 - val_accuracy: 0.6776 - val_loss: 1.0303 - learning_rate: 5.0000e-04
Epoch 8/50
121/121 - 1s - 6ms/step - accuracy: 0.7141 - loss: 0.8339 - val_accuracy: 0.7150 - val_loss: 0.8715 - learning_rate: 5.0000e-04
Epoch 9/50
121/121 - 1s - 6ms/step - accuracy: 0.7328 - loss: 0.7788 - val_accuracy: 0.6986 - val_loss: 0.8554 - learning_rate: 5.0000e-04
Epoch 10/50
121/121 - 1s - 10ms/step - accuracy: 0.7523 - loss: 0.7209 - val_accuracy: 0.7033 - val_loss: 0.9057 - learning_rate: 5.0000e-04
Epoch 11/50
121/121 - 2s - 16ms/step - accuracy: 0.7855 - loss: 0.6291 - val_accuracy: 0.6893 - val_loss: 0.9858 - learning_rate: 5.0000e-04
Epoch 12/50
121/121 - 2s - 17ms/step - accuracy: 0.8050 - loss: 0.5553 - val_accuracy: 0.7874 - val_loss: 0.7178 - learning_rate: 2.5000e-04
Epoch 13/50
121/121 - 3s - 21ms/step - accuracy: 0.8102 - loss: 0.5413 - val_accuracy: 0.8178 - val_loss: 0.6263 - learning_rate: 2.5000e-04
Epoch 14/50
121/121 - 1s - 10ms/step - accuracy: 0.8238 - loss: 0.5014 - val_accuracy: 0.8084 - val_loss: 0.6707 - learning_rate: 2.5000e-04
Epoch 15/50
121/121 - 1s - 10ms/step - accuracy: 0.8300 - loss: 0.4853 - val_accuracy: 0.8271 - val_loss: 0.6401 - learning_rate: 2.5000e-04
Epoch 16/50
121/121 - 1s - 10ms/step - accuracy: 0.8479 - loss: 0.4497 - val_accuracy: 0.8271 - val_loss: 0.5921 - learning_rate: 1.2500e-04
Epoch 17/50
121/121 - 1s - 11ms/step - accuracy: 0.8594 - loss: 0.4118 - val_accuracy: 0.8201 - val_loss: 0.6096 - learning_rate: 1.2500e-04
Epoch 18/50
121/121 - 1s - 6ms/step - accuracy: 0.8617 - loss: 0.4021 - val_accuracy: 0.8318 - val_loss: 0.5873 - learning_rate: 1.2500e-04
Epoch 19/50
121/121 - 1s - 6ms/step - accuracy: 0.8653 - loss: 0.3971 - val_accuracy: 0.8318 - val_loss: 0.5913 - learning_rate: 1.2500e-04
Epoch 20/50
121/121 - 1s - 10ms/step - accuracy: 0.8659 - loss: 0.3864 - val_accuracy: 0.8107 - val_loss: 0.6136 - learning_rate: 1.2500e-04
Epoch 21/50
121/121 - 1s - 10ms/step - accuracy: 0.8617 - loss: 0.3762 - val_accuracy: 0.8178 - val_loss: 0.5905 - learning_rate: 6.2500e-05
Epoch 22/50
121/121 - 1s - 11ms/step - accuracy: 0.8690 - loss: 0.3670 - val_accuracy: 0.8341 - val_loss: 0.5914 - learning_rate: 6.2500e-05
Epoch 23/50
121/121 - 1s - 11ms/step - accuracy: 0.8757 - loss: 0.3622 - val_accuracy: 0.8271 - val_loss: 0.5862 - learning_rate: 3.1250e-05
Epoch 24/50
121/121 - 1s - 10ms/step - accuracy: 0.8752 - loss: 0.3552 - val_accuracy: 0.8341 - val_loss: 0.5776 - learning_rate: 3.1250e-05
Epoch 25/50
121/121 - 1s - 8ms/step - accuracy: 0.8726 - loss: 0.3597 - val_accuracy: 0.8341 - val_loss: 0.6003 - learning_rate: 3.1250e-05
Epoch 26/50
121/121 - 1s - 6ms/step - accuracy: 0.8807 - loss: 0.3481 - val_accuracy: 0.8294 - val_loss: 0.5695 - learning_rate: 3.1250e-05
Epoch 27/50
121/121 - 1s - 6ms/step - accuracy: 0.8786 - loss: 0.3482 - val_accuracy: 0.8294 - val_loss: 0.6118 - learning_rate: 3.1250e-05
Epoch 28/50
121/121 - 1s - 6ms/step - accuracy: 0.8778 - loss: 0.3558 - val_accuracy: 0.8341 - val_loss: 0.5732 - learning_rate: 3.1250e-05
Epoch 29/50
121/121 - 1s - 10ms/step - accuracy: 0.8828 - loss: 0.3325 - val_accuracy: 0.8294 - val_loss: 0.5729 - learning_rate: 1.5625e-05
Epoch 30/50
121/121 - 1s - 6ms/step - accuracy: 0.8825 - loss: 0.3349 - val_accuracy: 0.8364 - val_loss: 0.5890 - learning_rate: 1.5625e-05
Epoch 31/50
121/121 - 1s - 10ms/step - accuracy: 0.8755 - loss: 0.3481 - val_accuracy: 0.8341 - val_loss: 0.5763 - learning_rate: 7.8125e-06
In [101]:
plt.plot(history4.history['accuracy'])
plt.plot(history4.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
In [102]:
test_loss, test_accuracy = model4.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 0s - 27ms/step - accuracy: 0.8400 - loss: 0.5796
In [103]:
y_pred4 = model4.predict(X_test_normalized)
# Obtain the categorical class indices from y_test_encoded and y_pred4
y_pred4_arg = np.argmax(y_pred4, axis=1)
y_test_arg = np.argmax(y_test_encoded, axis=1)

# Plot the confusion matrix using tf.math.confusion_matrix(), a predefined TensorFlow function
confusion_matrix = tf.math.confusion_matrix(y_test_arg, y_pred4_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    confusion_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
15/15 ━━━━━━━━━━━━━━━━━━━━ 1s 23ms/step
In [104]:
cr4 = metrics.classification_report(y_test_arg, y_pred4_arg, output_dict=True)
f1_4 = cr4['macro avg']['f1-score']
acc_4 = cr4['accuracy']
print('f1-score:', f1_4)
print('Accuracy:', acc_4)
f1-score: 0.8159262460432947
Accuracy: 0.84
In [105]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
In [106]:
datagen = ImageDataGenerator(horizontal_flip=True,
                             vertical_flip=False,
                             height_shift_range=0.1,
                             width_shift_range=0.1,
                             rotation_range=90,
                             zoom_range=0.1,
                             )
In [109]:
model_vgg = VGG16(weights='imagenet', include_top=False, input_shape=(224,224,3))
model_vgg.summary()

# Alternatively, if you are NOT taking all CNN layers and are dropping only the fully connected head,
# you can specify the last convolutional layer to keep (up to and including it).
# Here 'block5_pool' is already the last layer, so both options return the same model,
# but the layer argument can be adjusted.
# xfer_layer = model_vgg.get_layer('block5_pool')
# model_vgg = Model(inputs=model_vgg.input, outputs=xfer_layer.output)
Model: "vgg16"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ input_layer (InputLayer)             │ (None, 224, 224, 3)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_conv1 (Conv2D)                │ (None, 224, 224, 64)        │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_conv2 (Conv2D)                │ (None, 224, 224, 64)        │          36,928 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_pool (MaxPooling2D)           │ (None, 112, 112, 64)        │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_conv1 (Conv2D)                │ (None, 112, 112, 128)       │          73,856 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_conv2 (Conv2D)                │ (None, 112, 112, 128)       │         147,584 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_pool (MaxPooling2D)           │ (None, 56, 56, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv1 (Conv2D)                │ (None, 56, 56, 256)         │         295,168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv2 (Conv2D)                │ (None, 56, 56, 256)         │         590,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv3 (Conv2D)                │ (None, 56, 56, 256)         │         590,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_pool (MaxPooling2D)           │ (None, 28, 28, 256)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv1 (Conv2D)                │ (None, 28, 28, 512)         │       1,180,160 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv2 (Conv2D)                │ (None, 28, 28, 512)         │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv3 (Conv2D)                │ (None, 28, 28, 512)         │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_pool (MaxPooling2D)           │ (None, 14, 14, 512)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv1 (Conv2D)                │ (None, 14, 14, 512)         │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv2 (Conv2D)                │ (None, 14, 14, 512)         │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv3 (Conv2D)                │ (None, 14, 14, 512)         │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_pool (MaxPooling2D)           │ (None, 7, 7, 512)           │               0 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 14,714,688 (56.13 MB)
 Trainable params: 14,714,688 (56.13 MB)
 Non-trainable params: 0 (0.00 B)
In [110]:
# Freeze all of the VGG layers so they are non-trainable
for layer in model_vgg.layers:
    layer.trainable = False
In [111]:
# Verify that the transferred layers are frozen and won't be retrained
for layer in model_vgg.layers:
    print(layer.name, layer.trainable)
input_layer False
block1_conv1 False
block1_conv2 False
block1_pool False
block2_conv1 False
block2_conv2 False
block2_pool False
block3_conv1 False
block3_conv2 False
block3_conv3 False
block3_pool False
block4_conv1 False
block4_conv2 False
block4_conv3 False
block4_pool False
block5_conv1 False
block5_conv2 False
block5_conv3 False
block5_pool False
In [112]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
In [113]:
model6 = Sequential()

# Start with the pre-trained VGG layers
model6.add(model_vgg)

# Flatten the 7x7x512 VGG output into a feature vector
model6.add(Flatten())

# Add the dense (ANN) layers from model4 above
model6.add(Dense(64, activation='relu'))
model6.add(Dropout(0.25))
model6.add(Dense(32, activation='relu'))
model6.add(Dropout(0.25))
model6.add(Dense(16, activation='relu'))

# Output layer: one unit per plant species
model6.add(Dense(12, activation='softmax'))
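A quick sanity check on the size of this classification head: flattening VGG's 7x7x512 block5_pool output produces 25,088 features, so the first Dense(64) layer alone contributes roughly 1.6M trainable parameters. This is plain arithmetic, not run against the model itself:

```python
# Flattened VGG16 block5_pool output for a 224x224 input
features = 7 * 7 * 512

# Dense(64) parameters = (inputs x units) weights + units biases
dense_params = features * 64 + 64

print(features, dense_params)  # 25088 1605696
```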
In [116]:
# clearing the backend and setting the seeds
backend.clear_session()

np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
In [118]:
model7 = Sequential()

# Start with the pre-trained VGG layers
model7.add(model_vgg)

# Convolutional layers from model4
model7.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model7.add(MaxPooling2D(2, 2))
model7.add(BatchNormalization())
model7.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model7.add(MaxPooling2D(2, 2))
model7.add(BatchNormalization())
model7.add(Conv2D(16, (3, 3), activation='relu', padding='same'))
# model7.add(MaxPooling2D(2, 2))  # omitted: the feature map is already 1x1 here
model7.add(BatchNormalization())

# Flatten into a feature vector
model7.add(Flatten())

# Add the dense (ANN) layers from model4 above
model7.add(Dense(64, activation='relu'))
model7.add(Dropout(0.25))
model7.add(Dense(32, activation='relu'))
model7.add(Dropout(0.25))
model7.add(Dense(16, activation='relu'))

# Output layer: one unit per plant species
model7.add(Dense(12, activation='softmax'))


# Compile with the parameters from model4 above
model7.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# model7.summary()
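A side note on why the third pooling layer in model7 is commented out: VGG's block5_pool output is 7x7 spatially, and each 2x2 max-pool with stride 2 floors the spatial size, so after two pools the feature map is already 1x1 and cannot be pooled again. The arithmetic as a standalone sketch:

```python
def pool2x2(size):
    """Spatial size after a 2x2 max-pool with stride 2 (floor division)."""
    return size // 2

size = 7  # VGG16 block5_pool output is 7x7x512 for a 224x224 input
for step in range(1, 3):
    size = pool2x2(size)
    print(f"after pool {step}: {size}x{size}")
# after pool 1: 3x3
# after pool 2: 1x1 -> a third 2x2 pool would collapse the map to 0x0
```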

Final Model¶

Based on the comparison below, the final model is selected and then used in the code that follows to visualize its predictions.

In [121]:
cnn_compare = pd.DataFrame({'Model1':[f1_1, acc_1],
                            'Model2':[f1_2, acc_2],
                            'Model3':[f1_3, acc_3],
                            'Model4':[f1_4, acc_4]},
                            index=['f1', 'acc'])
cnn_compare
Out[121]:
     Model1  Model2  Model3  Model4
f1     0.64    0.62    0.08    0.82
acc    0.71    0.65    0.14    0.84
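The f1 and acc rows above were presumably computed from held-out predictions earlier in the notebook. A minimal sketch of those two metrics with scikit-learn, assuming a weighted-average F1; the toy labels below are illustrative, not from the dataset:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical predicted and true class indices (real values come from argmax
# over the model's softmax outputs, as in the confusion-matrix cell below)
y_true = np.array([0, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 2, 1, 3, 3])

acc = accuracy_score(y_true, y_pred)                # fraction of exact matches
f1 = f1_score(y_true, y_pred, average='weighted')   # per-class F1, weighted by support

print(round(acc, 2), round(f1, 2))  # 0.83 0.83
```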

Visualizing the prediction¶

In [62]:
accuracy = model4.evaluate(X_test_normalized, y_test_encoded, verbose=2)
15/15 - 0s - 8ms/step - accuracy: 0.8295 - loss: 0.5866
In [63]:
y_pred4 = model4.predict(X_test_normalized)

# Convert the one-hot labels and predicted probabilities back to class indices
y_pred4_arg = np.argmax(y_pred4, axis=1)
y_test_arg = np.argmax(y_test_encoded, axis=1)

# Build the confusion matrix with TensorFlow's predefined tf.math.confusion_matrix()
conf_matrix = tf.math.confusion_matrix(y_test_arg, y_pred4_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    conf_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step 
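The same matrix can be built without TensorFlow, which can be handy for inspecting counts outside the training environment. A minimal NumPy sketch (the toy labels are illustrative, not from the dataset):

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, num_classes):
    """Count (true, predicted) pairs into a num_classes x num_classes grid."""
    idx = y_true * num_classes + y_pred  # flatten each pair to a single bin
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)

y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 1, 1, 2, 1, 2])
print(confusion_matrix_np(y_true, y_pred, 3))
# [[1 0 1]
#  [0 2 0]
#  [0 1 1]]
```

Rows are true classes, columns are predicted classes, matching the heatmap above.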

Actionable Insights and Business Recommendations¶

I recommend using Model 4. It achieves roughly 83% accuracy on the test set (0.83 on evaluation, 0.84 in the comparison table above), which would drastically reduce the amount of manual classification work for farmers. The model has the added benefit of requiring little image preprocessing and no costly data augmentation. It could be integrated with automated weeding systems, allowing weeds to be targeted while crops are spared, reducing overall pesticide use and leading to more eco-friendly farming. A mobile application built on this model could also let farmers identify an unknown plant from a photograph and make decisions immediately.